ABSTRACT

With the recent surge of social networks like Facebook, new 
forms of recommendations have become possible - personal- 
ized recommendations of ads, content, and even new social 
and product connections based on one's social interactions. 
In this paper, we study whether "social recommendations",
or recommendations that utilize a user's social network, can 
be made without disclosing sensitive links between users. 
More precisely, we quantify the loss in utility when existing 
recommendation algorithms are modified to satisfy a strong 
notion of privacy called differential privacy. We propose 
lower bounds on the minimum loss in utility for any recom- 
mendation algorithm that is differentially private. We also 
propose two recommendation algorithms that satisfy differ- 
ential privacy, analyze their performance in comparison to 
the lower bound, both analytically and experimentally, and 
show that good private social recommendations are feasible 
only for a few users in the social network or for a lenient 
setting of privacy parameters. 

1. INTRODUCTION 

Making recommendations or suggestions to users to in- 
crease their degree of engagement is a common practice for 
websites. For instance, Facebook recommends friends to 
existing users, Amazon suggests products, and Netflix rec- 
ommends movies, in each case with the goal of making as 
relevant a recommendation to the user as possible. Recom- 
mending the right content, product, or ad to an individual is 
one of the most important tasks in today's web companies. 
With the boom in social networking many companies are 
striving to incorporate the likes and dislikes of an individ- 
ual's social neighborhood. There has been much research 
and industrial activity to solve two problems: (a) recom- 
mending content, products, ads not only based on the indi- 
vidual's prior history but also based on the history of those 
the individual trusts [12], [5], and (b) recommending others
whom the individual might trust. Recommendations based
on social connections are especially effective for users who
have seen very few movies, bought only a couple of products,
or never clicked on ads; while traditional recommender sys-
tems default to generic recommendations, a social-network
aware system can provide useful recommendations based on
active friends. Companies like TrustedOpinion and SoMR
generate content and ad recommendations by leveraging so-
cial networks. In fact, Facebook, Yahoo, and Google are
opening their social networks to third party developers to
encourage social network-aware recommender systems.

In addition, a social network might want to use a different 
underlying social network, such as one derived from e-mail 
records or Instant Messenger connections, to suggest friends 
(e.g. Facebook already uses contacts imported from an ad- 
dress book as suggestions) . Social connections could also be 
used to recommend products or advertisements to users — 
Netflix (or Opentable or Yelp) could recommend movies (or 
restaurants) to a subscriber based on her friends' activities 
and ratings. In fact, rather than using the entire social 
graph, the system could use only a subset of trusted edges 
for that application (for instance, a user might only trust 
the movie recommendations of a subset of her friends). 

However, these improved recommendations based on so- 
cial connections come at a cost - a recommendation can 
potentially lead to a privacy breach by revealing sensitive 
information. For instance, while the social network links 
might be public, both the user-product links and the user- 
user-trust links must be kept secret. (Knowing that your 
friend doesn't trust your judgement about books might be 
a breach of privacy). Similarly, revealing an edge in an e- 
mail graph, or revealing that a particular user purchased 
a sensitive product, constitutes a potentially serious breach 
of user privacy. Recommendations can indeed lead to such 
privacy breaches even without the use of social connections 
in the recommendation algorithm [H] . The privacy concerns 
posed by recommender systems and use of the social network 
graph have been at the forefront of industry discussion on 
the topic. In 2007, Facebook attempted to incorporate the 
product purchases made by one's friends into the stream of 
news one receives while visiting the site through a product 
called Beacon. Their launch showed that people interact 
with many websites and products in a way that they would 
not want their friends to know about, leading to several 
privacy lawsuits, and an eventual complete removal of the 
feature by Facebook.

In this paper, we present the first theoretical study of 
the privacy-utility trade-offs in social recommender systems. 
While there are many different settings where social recom- 
mendations are used (friend/product recommendations, or 
trust propagation) , and each leads to a slightly different for- 
mulation of the privacy problem (the sensitive information 
is different in each case), all these problems have the fol- 
lowing common structure - recommendations are made on 
a graph where some subset of edges are sensitive. For clar- 
ity of exposition, we ignore (by and large) scenario specific 
constraints, and focus on the following general model. We 
consider a graph where all the edges are sensitive, and an 
algorithm that recommends a single node v in the graph 
to some target node u. We assume that the algorithm is 
based on a utility function that encodes the "goodness" of 
recommending each node in the graph to this target node. 
Suggestions for utility functions include number of common 
neighbors, weighted paths and PageRank distributions [21] . 
We consider an attacker who wishes to deduce the existence 
of a single edge (x, y) in the graph by passively observing the
recommendation (v, u). We measure the privacy of the algo-
rithm using differential privacy - the ratio of the likelihoods
of the algorithm recommending (v, u) on the graphs with
the edge (x, y) and without the edge (x, y), respectively. In
this setting, we ask the question: to what extent can edge 
recommendations be accurate while preserving differential 
privacy? 

Our Contributions and Overview. In this paper we 
present the following results on the accuracy of differentially 
private social recommendations. 

• We present a trade-off between the accuracy and pri- 
vacy of any social recommendation algorithm that is 
based on any general utility function. This trade-off
shows an inevitable lower bound on the privacy pa-
rameter $\epsilon$ that must be incurred by an algorithm that
wishes to guarantee any constant-factor approximation
of the maximum utility. (Section 4)

• We present lower bounds on accuracy and privacy for 
algorithms based on specific utility functions previ- 
ously suggested for recommending edges in a social 
network - number of common neighbors and weighted
paths. Our trade-offs for these specific utility func-
tions present stronger lower bounds than the general
one that is applicable for all utility functions. (Section 5)

• We adapt two well known privacy preserving algo- 
rithms from the differential privacy literature for the 
problem of social recommendations. The first (which 
we call Laplace), is based on adding random noise 
drawn from a Laplace distribution to the utility vector 
[8] and then recommending the highest utility node. 
The second (Exponential) is based on exponential smooth-
ing [15]. We analyze and compare the accuracy of the
two algorithms and comment on their relative merits.
(Section 6)

• We perform experiments on a real graph using the 
number of common neighbors utility function. The 
experiments compare the algorithms Laplace, Expo- 
nential, and our lower bound. Our experiments sug-
gest three takeaways: (i) For most nodes, the lower
bounds suggest that there is a huge inevitable trade-
off between privacy and accuracy when making social
recommendations; (ii) The more natural Laplace algo-
rithm performs as well as Exponential; and (iii) For a
large fraction of nodes, both Laplace and Exponential
almost achieve the maximum accuracy level suggested
by our theoretical lower bound. (Section 7)

• We briefly consider the setting when an algorithm may 
not know (or be able to compute efficiently) the entire 
utility vector. We recognize that both Laplace and Ex- 
ponential algorithms assume the knowledge of all the 
utilities (for every node) when recommending to a tar- 
get node. We propose and analyze a sampling based 
linear smoothing algorithm that does not require all 
utilities to be pre-computed. We conclude by men-
tioning several directions for future work. (Section 8)

We now discuss related work and then formalize the mod-
els and definitions in Section 3.

2. RELATED WORK 

Several papers propose that the available social connec-
tions can be effectively utilized for enhancing online appli-
cations [12], [2]. Golbeck [10] uses the trust relationships
expressed through social connections for personalized movie
recommendations and shows that the accuracy of the ratings
outperforms that produced by a collaborative filtering algo-
rithm not utilizing the social graph. Mislove et al. [16] at-
tempt an integration of web search with social networks and
explore the use of trust relationships, such as social links, to
thwart unwanted communication [17]. Approaches incorpo-
rating trust models into recommender systems are gaining
momentum both in academic research [25], [18], [23], and in
real products. Examples include Chorus, which provides
social app recommendations for the iPhone; Fruugo.com,
a social e-commerce site; and WellNet's online social net-
working program for health care coordination.

Calandrino et al. [5] demonstrate that algorithms that
recommend products based on friends' purchases raise very
practical privacy concerns: "passive observations of Ama-
zon.com's recommendations are sufficient to make valid in-
ferences about individuals' purchase histories". McSherry et
al. [14] show how to adapt the leading algorithms used in the
Netflix Prize for movie recommendations to make privacy-
preserving recommendations. Their work does not apply
to algorithms that rely on the underlying social graph be-
tween users, as the user-user connections have not been re-
leased as part of the Netflix competition. Aïmeur et al. [1]
study the problem of personalized recommendations in gen-
eral. Dwork et al. [9] pose the problem of constructing
differentially private analysis of social networks. Toubiana
et al. [24] propose a framework for privacy preserving tar-
geted advertising - while targeting based on user history
is considered, targeting based on social interactions is not
considered.

A related and independent work Ul considers the problem 
of mining top-k frequent item-sets. Although they consider
mechanisms analogous to the ones we propose, since we solve
two different problems, the focus of their analysis, notion of
utility, and conclusions substantially differ from ours.

http://www.envionetworks.com/
http://fruugo.com/
http://www.wellnet.com/
www.medicalnewstoday.com/articles/118628.php

3. MODEL 

In this section, we describe the model for privacy-preserving 
social recommendations. We first define a social recommen- 
dation algorithm and then mention notions of monotonicity 
and accuracy of an algorithm. We then define axioms fol- 
lowed by typical utility functions that such algorithms are 
based on. Finally, we define differential privacy. 

3.1 Social Recommendation Algorithm 

Let G = (V, E) be the graph that describes the social net-
work. Each recommendation is an edge (i, r), where node i
is recommended to the target node r. Given a graph G and
a target node r, we denote the utility of recommending node
i to node r by $u_i^{G,r}$. Further, we assume that a recommen-
dation algorithm R is a probability vector on all nodes. Let
$p_i^{G,r}$ denote the probability of recommending node i to node
r in graph G by a specified algorithm. When the graph G
and the target node r are clear from context, we drop G and
r from the notation - $u_i$ denotes the utility of recommend-
ing i, and $p_i$ denotes the probability that R recommends i.
We further define $u_{max} = \max_i u_i$.

We consider algorithms that attempt to maximize the ex-
pected utility ($\sum_i u_i \cdot p_i$) of each recommendation. If we
assume (without loss of generality) that the utility of the
least useful recommendation is 0, the accuracy of such an
algorithm can be defined as:

Definition 1 (Accuracy). An algorithm A is said to
be $(1 - \delta)$-accurate if given any set of utilities $u_i$ (for all i)
denoted by u, A recommends node i with probability $p_i$ such
that $(1 - \delta) = \min_u \frac{\sum_i u_i p_i}{u_{max}}$.

Therefore, an algorithm is said to be $(1 - \delta)$-accurate if for
any utility vector, the algorithm's expected utility is at least
$(1 - \delta)$ times the utility of the highest utility node in the given
utility vector. It is easy to check that for the case when the
utility of the least useful recommendation is $u_{min}$, in all of
our subsequent discussions, the definition of accuracy we
use is equivalent to accuracy defined as the fraction of the
difference between $u_{max}$ and $u_{min}$.
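
As an illustration of the accuracy ratio, the following minimal sketch evaluates the expected utility divided by the maximum utility for a single utility vector. The function names and the example numbers are hypothetical, and the helper checks only one vector, whereas Definition 1 takes the minimum over all utility vectors.

```python
def expected_utility(utilities, probabilities):
    """Expected utility sum_i u_i * p_i of a randomized recommendation."""
    return sum(u * p for u, p in zip(utilities, probabilities))


def accuracy_on(utilities, probabilities):
    """Ratio of expected utility to u_max for ONE utility vector.

    Definition 1 takes the minimum of this ratio over all utility
    vectors; this helper only evaluates a single vector.
    """
    u_max = max(utilities)
    if u_max <= 0:
        raise ValueError("need at least one node with positive utility")
    return expected_utility(utilities, probabilities) / u_max


# Hypothetical utilities for four candidate nodes (least useful has utility 0).
utilities = [3.0, 1.0, 0.0, 0.0]
always_best = [1.0, 0.0, 0.0, 0.0]   # non-private: always recommend the best node
uniform = [0.25, 0.25, 0.25, 0.25]   # ignores the utilities entirely

print(accuracy_on(utilities, always_best))
print(accuracy_on(utilities, uniform))
```

On this vector the non-private rule attains ratio 1, while the uniform rule keeps only a third of the maximum utility.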

Scale Invariance of Sensitivity and Utility Func- 
tions. We initiate a small discussion on what happens 
when the utility values for all potential recommendable nodes 
are scaled by a multiplicative factor, or changed by an addi- 
tive constant. Intuitively, since the scale of utilities is chosen 
arbitrarily, one would expect the algorithms and the anal- 
ysis to be invariant to such numeric changes. However, be- 
cause of the constraints imposed by the desire to be privacy- 
preserving, where the privacy-preservation is with respect to 
a presence or absence of a particular edge, the scale invari- 
ance assumptions require a more careful articulation. In 
particular, the crucial point of interaction between the pri-
vacy requirement and the utility function is the concept of
sensitivity, denoted by $\Delta f$, which is the maximum change
in a utility vector u that can occur due to an addition or
removal of one edge in the graph. Observe that if we scale a
utility function by a multiplicative constant, the sensitivity
of the utility function is scaled as well by the same constant.
Without loss of generality, and for ease of subsequent expo-
sition, we assume that $\Delta f = 1$, an assumption that implies



that the magnitudes of the utilities are now meaningful, as 
the higher utility magnitude corresponds to more edges that 
need to be added or removed in the graph in order to achieve 
it. Equivalently, we could have chosen to let the utilities be 
scale invariant, but would then need to compute and reason 
in terms of the sensitivity of the utility function. 

Another property that is natural for a recommendation
algorithm is monotonicity:

Definition 2 (Monotonicity). An algorithm is said
to be monotonic if $\forall i, j$, $u_i > u_j$ implies that $p_i > p_j$.

3.2 Axioms on Utility Functions 

We now define two axioms that we believe should be sat- 
isfied by any meaningful utility function in the context of 
recommendations on a social network. These axioms are 
later used in proving our theoretical results. Our axioms are 
inspired by the work of [21] and the specific utility functions
they consider, which include: number of common neighbors, 
sum of weighted paths, and PageRank based utility mea- 
sures. 

Axiom 1 (Exchangeability). Let G be a graph and
let h be an isomorphism on the nodes giving graph $G_h$, s.t.
for target node r, h(r) = r. Then $\forall i: u_i^{G,r} = u_{h(i)}^{G_h,r}$.

This axiom captures the intuition that the utility of a 
node i should not depend on the node's name. Rather, its 
utility with respect to target node r only depends on the 
structural properties of the graph, and so, nodes that are 
isomorphic from the perspective of the target node r should 
have the same utility. 

Axiom 2 (Concentration Axiom). There exists $S \subset V(G)$
such that $|S| = \beta$ and $\sum_{i \in S} u_i \ge \Omega(1) \sum_{i \in V(G)} u_i$.

This says that there are some $\beta$ nodes that together have
at least a constant fraction of the total utility mass. This ax-
iom is likely to be satisfied for small enough $\beta$, since usually
there are some nodes that are very good for recommendation
and many that are not so good.

In the subsequent lower bound sections, we only consider 
monotonic algorithms for utility functions that satisfy the 
exchangeability axiom as well as the concentration axiom
for a reasonable choice of $\beta$.

A running example throughout the paper of a utility func-
tion that satisfies these axioms in practical settings and is
often deployed [21] is that of the number of common neigh-
bors utility function: given a target node r and a graph
G, the common neighbors utility metric assigns a utility
$u_i^{G,r} = c(i, r)$, where c(i, r) is the number of common neigh-
bors between i and r.
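
The common neighbors utility can be sketched in a few lines. This is an illustrative implementation only: the adjacency-dictionary representation, the function name, and the choice to exclude r and its existing neighbors from the candidates are assumptions of the sketch, not details given in the text.

```python
def common_neighbors_utilities(adj, r):
    """Return {i: c(i, r)} for every candidate node i.

    adj maps each node to the set of its neighbors (undirected graph).
    Candidates exclude r itself and nodes already adjacent to r; that
    exclusion is an assumption of this sketch.
    """
    neighbors_r = adj[r]
    return {
        i: len(adj[i] & neighbors_r)
        for i in adj
        if i != r and i not in neighbors_r
    }


# Small hypothetical graph: r is adjacent to a and b.
adj = {
    "r": {"a", "b"},
    "a": {"r", "x", "y"},
    "b": {"r", "x"},
    "x": {"a", "b"},
    "y": {"a"},
    "z": set(),
}
utilities = common_neighbors_utilities(adj, "r")
print(utilities)
```

Here x shares both of r's neighbors, y shares one, and the isolated node z shares none, so most of the utility mass sits on a single candidate.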

3.3 Differential privacy 

Differential privacy [6] is a strong definition of privacy
that is based on the following principle: an algorithm pre- 
serves the privacy of an entity if the algorithm's output is 
not sensitive to the presence or absence of the entity's infor- 
mation in the input data set. In our setting of graph-based 
social recommendations, we wish to keep the presence
(or absence) of an edge in the graph private. Hence, the
privacy definition can be formally stated as follows. 



Definition 3. A recommendation algorithm R satisfies
$\epsilon$-differential privacy if for any pair of graphs G and G' that
differ in one edge (i.e., G = G' + {e} or vice versa) and
every set of possible recommendations S,

$\Pr[R(G) \in S] \le \exp(\epsilon) \times \Pr[R(G') \in S]$ (1)

where the probabilities are over the random coins of R.

Differential privacy has been widely used in the privacy liter-
ature [3], [8], [13], [15], [7], since it is resilient even to adversaries
who know all but one edge in the graph, and guarantees
privacy for multiple runs of the algorithm. While weaker
notions of privacy have also been considered in the litera- 
ture, in this paper we focus on the strong differential privacy 
definition only. Since in social recommendations protecting 
privacy is extremely important, it seems reasonable to first 
explore and understand the strongest notions of privacy. 
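
For a recommender that outputs a single node, the differential privacy condition can be checked directly on the two probability vectors produced for a pair of neighboring graphs. The sketch below is illustrative: the function name and the example vectors are hypothetical. It relies on the fact that, for a finite output space, the per-node inequalities imply the inequality for every set S by summation.

```python
import math


def satisfies_dp(p, q, epsilon):
    """Check Pr[R(G) in S] <= exp(eps) * Pr[R(G') in S] in both directions.

    p and q are the recommendation probability vectors on neighboring
    graphs G and G'. Checking single outcomes suffices for a finite
    output space, since the set inequality follows by summing.
    """
    bound = math.exp(epsilon)
    return all(pi <= bound * qi and qi <= bound * pi for pi, qi in zip(p, q))


# Hypothetical distributions over two candidate nodes.
print(satisfies_dp([0.6, 0.4], [0.5, 0.5], epsilon=0.25))
print(satisfies_dp([0.6, 0.4], [0.5, 0.5], epsilon=0.1))
# A node with probability 0 on one graph and positive probability on the
# other violates the condition for every finite epsilon.
print(satisfies_dp([1.0, 0.0], [0.9, 0.1], epsilon=10.0))
```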

In this paper, we only consider the utility of a single social 
recommendation. We note that in this setting, we can relax 
the differential privacy definition such that Equation (1) only
holds for graphs G and G' that differ in an edge e that is 
not incident on r, the target of the recommendation. This 
mirrors the natural setting where (a) one recommendation 
is made to the attacker (r), (b) only the target node (the 
attacker) sees the recommendation. By considering G and 
G' that differ in e = (i, r), the adversary can only learn 
about his neighborhood (which he is aware of to start with) 
and not learn whether two legitimate nodes in the graph 
are connected. While we consider a single recommendation
throughout the paper, we use the relaxed variant of differ-
ential privacy only in Sections 5 and 7.

4. GENERAL LOWER BOUND 

In this section we prove a lower bound on the privacy
parameter $\epsilon$ of any differentially private recommendation
algorithm that (a) achieves a constant accuracy and (b) is
based on any utility function that satisfies the exchangeabil-
ity and concentration axioms.

Let us first sketch the proof technique for the lower bound 
using the number of common neighbors utility metric, and 
then state the lower bound for a general utility metric. An 
interested reader can find the full proofs in the Appendix. 
Recall that given a target node r and a graph G, the common
neighbors utility metric assigns a utility $u_i^{G,r} = c(i, r)$, where
c(i, r) is the number of common neighbors between i and r.
The nodes in any graph can be split into two groups - $V_{hi}^r$,
nodes which have a high utility for the target node r, and
$V_{lo}^r$, nodes that have a low utility. In the case of common
neighbors, all nodes i in the 2-hop neighborhood of r (who
have at least one common neighbor with r) can be part of
$V_{hi}^r$ and the rest in $V_{lo}^r$. Since the recommendation algorithm
has to achieve a constant accuracy, it has to recommend one
of the high utility nodes with constant probability.

By the concentration axiom, there are only a few nodes
in $V_{hi}^r$, but there are many nodes in $V_{lo}^r$; in the case of com-
mon neighbors, node r may only have 10s or 100s of 2-hop
neighbors in a graph of millions of users. Hence, there exists
a node i in the high utility group and a node $\ell$ in the low
utility group such that $\Gamma = p_i / p_\ell$ is very large ($\Omega(n)$). At
this point, we show that we can carefully modify the graph
G by adding and/or deleting a small number (t) of edges in
such a way that the node $\ell$ with the smallest probability of
being recommended in G becomes the node with the highest
utility in G'. By the exchangeability axiom, we can show
that there always exist some t edges that make this possi-
ble. For instance, in the common neighbors case, we can do
this by adding edges between a node i and t of r's neigh-
bors, where $t > \max_i c(i, r)$. It now follows from differential
privacy that

$\epsilon \ge \frac{1}{t} \log \Gamma$

More generally, let $V_{hi}^r$ be the set of nodes 1, ..., k each of
which have utility $u_i > (1 - c) u_{max}$, and let $V_{lo}^r$ be the nodes
k + 1, ..., n each of which have utility $u_i \le (1 - c) u_{max}$ of
being recommended to target node r. Recall that $u_{max}$ is
the utility of the highest utility node. Let t be the number
of edge alterations required to turn a node with the smallest
probability of being recommended from the low utility group
into a node of maximum utility in the modified graph. The
following lemma states the main trade-off relationship be-
tween the accuracy parameter $\delta$ and the privacy parameter
$\epsilon$ of a recommendation algorithm.

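Lemma 1. $\epsilon \ge \frac{1}{t} \left( \ln\left(\frac{c - \delta}{\delta}\right) + \ln\left(\frac{n - k}{k + 1}\right) \right)$

This lemma gives us a lower bound on the privacy guar-
antee $\epsilon$ in terms of the utility parameter $\delta$. Equivalently,

Corollary 1. $1 - \delta \le 1 - \frac{c (n - k)}{n - k + (k + 1) e^{\epsilon t}}$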

By using the concentration axiom with parameter $\beta$ we
can prove the following.

Lemma 2. For $(1 - \delta) = \Omega(1)$ and $\beta = o(n / \log n)$,

$\epsilon \ge \frac{\log n - o(\log n)}{t}$



This expression can be intuitively interpreted as follows:
in order to achieve good accuracy with a reasonable amount
of privacy (where $\epsilon$ is independent of n), either the number
of nodes with high utility needs to be very large (i.e. $\beta$ needs
to be very large, $\Omega(n / \log n)$), or the number of steps needed
to bring up any node's utility to the highest utility needs to
be large (i.e. t needs to be large, $\Omega(\log n)$).

We shall use this relationship from Lemma 2 in the sub-
sequent section to prove stronger lower bounds for specific
utility functions. Below we mention a generic lower bound
that applies to any utility function. Note that we only need
an upper bound on t. The tighter the upper bound we are able
to prove on t, the better the lower bound we get for $\epsilon$.

Using the exchangeability axiom, we can show that
$t \le 4 d_{max}$ in any graph. Consider the highest utility node and
the lowest utility node, say x and y respectively. These
nodes can be interchanged by deleting all of x's current
edges, adding edges from x to y's neighbors, and doing the
same for y. This requires at most $4 d_{max}$ changes. Hence,

Theorem 1. For a graph with maximum degree $d_{max} = \alpha \log n$,
a differentially private algorithm can guarantee con-
stant accuracy only if

$\epsilon \ge \frac{1}{\alpha} \left( \frac{1}{4} - o(1) \right)$ (3)
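
The node-interchange argument used to bound t can be checked on a small example. The sketch below is only an illustration: the helper names and the toy edge list are hypothetical. It relabels x and y (the isomorphism h, which leaves the target r fixed) and counts how many edge additions and deletions separate the original graph from the relabeled one.

```python
def swap_roles(edges, x, y):
    """Relabel x <-> y in an undirected edge set (the isomorphism h)."""
    def h(v):
        if v == x:
            return y
        if v == y:
            return x
        return v
    return {frozenset(h(v) for v in e) for e in edges}


def edge_alterations(edges, x, y):
    """Number of edge additions plus deletions that turn G into h(G)."""
    return len(edges ^ swap_roles(edges, x, y))


def max_degree(edges):
    """Maximum degree d_max of the graph given as a set of 2-node edges."""
    degree = {}
    for e in edges:
        for v in e:
            degree[v] = degree.get(v, 0) + 1
    return max(degree.values())


# Hypothetical graph: x shares two neighbors with r, y shares only one.
edges = {frozenset(e) for e in [
    ("r", "a"), ("r", "b"), ("a", "x"), ("b", "x"), ("a", "y"), ("c", "y"),
]}
t = edge_alterations(edges, "x", "y")
print(t, 4 * max_degree(edges))
```

Only edges incident to x or y can change under the relabeling, which is why the count stays within 4 times the maximum degree.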



In the next section, we present stronger lower bounds for 
two well studied utility functions - common neighbors and 
weighted paths. 



5. LOWER BOUNDS FOR SPECIFIC UTIL- 
ITY FUNCTIONS 

In this section, we start from Lemma 2 and prove stronger
lower bounds for specific utility functions by proving stronger
upper bounds on t. Proofs and more details can be found
in the Appendix.

5.1 Common neighbors lower bound 

Consider a graph and a target node r. As we saw in the
previous section, we can make any node x have the highest
utility by adding edges to all of r's neighbors. If $d_r$ is r's
degree, it suffices to add $t = d_r + O(1)$ edges to make a node
the highest utility node. We state the following theorem for
a more general version of the common neighbors utility function
below.

Theorem 2. Let U be a utility function that depends only
on and is monotonically increasing with c(x, y), the number
of common neighbors between x and y. A recommendation
algorithm based on U that guarantees any constant approx-
imation to utility for target node r has a lower bound on
privacy given by $\epsilon \ge \frac{1}{\alpha} (1 - o(1))$, where $d_r = \alpha \log n$.

As we will show in Section 7, this is a very strong lower
bound. Since a significant fraction of nodes in real-world
graphs have small $d_r$ (due to a power law degree distribu-
tion), we can expect no algorithm based on common neigh-
bors utility to be both accurate on most nodes and satisfy
differential privacy with a reasonable $\epsilon$.
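
To get a feel for how the bound scales with the degree of the target node, the following rough sketch evaluates only the leading term log(n)/t of the Lemma 2 bound with t taken to be the target degree. The function name, the network size, the degrees, and the use of the natural logarithm are assumptions of this illustration. Because the bound is asymptotic, the dropped lower-order terms may matter at finite n, so the numbers indicate a trend rather than exact thresholds.

```python
import math


def epsilon_lower_bound_leading_term(n, t):
    """Leading term log(n) / t of the lower bound, dropping o(log n) terms."""
    return math.log(n) / t


n = 1_000_000  # hypothetical network size
for d_r in (5, 10, 50, 200):
    bound = epsilon_lower_bound_leading_term(n, d_r)
    print(f"d_r = {d_r:3d}  ->  epsilon at least about {bound:.2f}")
```

Low-degree targets force a large privacy parameter, while the requirement relaxes only for targets whose degree is well above log n.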

5.2 Weighted Paths 

A natural extension of the common neighbors utility func-
tion, and one whose usefulness is supported by the literature
[21], is the weighted path utility function, defined as

$score(s, y) = \sum_{l=2}^{\infty} \gamma^{l-2} |paths^{(l)}_{s,y}|$,

where $|paths^{(l)}_{s,y}|$ denotes the number of length-l paths from
s to y. Typically, one would consider using small values of
$\gamma$, such as $\gamma = 0.005$, so that the weighted paths score is a
"smoothed version" of the common neighbors score.

Again let r be the target node. To make node y the highest
utility node, we add edges such that y has $c \cdot d_r$ common
neighbors with r. Now, the goal is to choose c > 1 such
that this alone is sufficient to ensure that y has the highest
utility. This is done by showing that (a) no other node
has more than $d_r$ common neighbors with r, and (b) the
utility derived from paths of length $\ge 3$ cannot offset the
additional common neighbors between y and r (for suitably
small $\gamma$). Finally, we show that this requires adding only
$t \le d_r + 2 (c - 1) + O(1)$ edges.

Theorem 3. For weighted paths based utility functions
with parameter $\gamma$, we have $t \le (1 + o(1)) d_r$ when making
recommendations for node r, if $\gamma = o(1 / d_{max})$. Therefore, for
an algorithm to guarantee constant approximation to utility,
the privacy must be $\epsilon \ge \frac{1}{\alpha} (1 - o(1))$ where $d_r = \alpha \log n$.

6. PRIVACY-PRESERVING RECOMMENDA- 
TION ALGORITHMS 

There has been a wealth of literature on developing differ-
entially private algorithms [3], [8], [15]. In this section we will
adapt two well known tools, Laplace noise addition [8] and
exponential smoothing [15], to our problem. For the pur-
pose of this section, we will assume that given a graph and
a target node, our algorithm has access to (or can efficiently
compute) the utilities $u_i$ for all other nodes in the graph.
Given this vector of utilities, our goal is to compute a vec-
tor of probabilities $p_i$ such that (a) $\sum_i u_i \cdot p_i$ is maximized,
and (b) differential privacy is satisfied.

Clearly, maximum accuracy is achieved by recommending
the node with utility $u_{max}$. However, it is well known that
any algorithm that satisfies differential privacy must recom-
mend every node, even the ones that have zero utility, with a
non-zero probability [20]. Indeed, suppose for graph G and
target node r, an algorithm assigns probability 0 to some
node x with utility $u_x^{G,r}$ and a positive probability to some
node y with utility $u_y^{G,r}$. Transform G into G' as follows:
connect x to all of y's neighbors in G and disconnect x from
all its neighbors in G. Do the same for y. This in turn cre-
ates an isomorphism h between G and G', where h(r) = r.
Hence, by the exchangeability axiom, the algorithm will rec-
ommend y with probability 0 in G'. Thus, there is a path from G
to G' of length t such that $p_y$ goes from a positive number
to 0. This leads to a breach of differential privacy.

The following two algorithms ensure differential privacy: 

6.1 Exponential mechanism 

The exponential mechanism creates a smooth probability 
distribution from the utility vector and then samples from 
that. 

Definition 4. Exponential mechanism: Given nodes
with utilities $(u_1, \ldots, u_i, \ldots, u_n)$, algorithm $A_E(\epsilon)$ recom-
mends node i with probability
$e^{\frac{\epsilon}{\Delta f} u_i} / \sum_{k=1}^{n} e^{\frac{\epsilon}{\Delta f} u_k}$, where $\epsilon \ge 0$ is the privacy parameter,
and $\Delta f$ is the sensitivity of the utility function.

Theorem 4. $A_E(\epsilon)$ guarantees $\epsilon$-differential privacy.
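
A minimal sketch of the Exponential mechanism as stated in Definition 4 follows. The function names, the default sensitivity of 1, and the `rng` parameter are choices of this illustration. The exponent uses epsilon divided by the sensitivity, following the definition above; general-purpose formulations of the exponential mechanism often halve this exponent, so the scaling should be checked against the privacy analysis being relied on.

```python
import math
import random


def exponential_probabilities(utilities, epsilon, sensitivity=1.0):
    """Probabilities proportional to exp(epsilon * u_i / sensitivity)."""
    scale = epsilon / sensitivity
    shift = max(utilities)  # subtracting the max avoids overflow; ratios are unchanged
    weights = [math.exp(scale * (u - shift)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(utilities, epsilon, sensitivity=1.0, rng=random):
    """Sample the index of one node from the smoothed distribution."""
    probs = exponential_probabilities(utilities, epsilon, sensitivity)
    return rng.choices(range(len(utilities)), weights=probs, k=1)[0]


# Hypothetical utilities for three candidates.
print(exponential_probabilities([2.0, 1.0, 0.0], epsilon=1.0))
print(exponential_mechanism([2.0, 1.0, 0.0], epsilon=1.0, rng=random.Random(0)))
```

Every node, including the zero-utility one, receives a strictly positive probability, and setting epsilon to 0 yields the uniform distribution.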

6.2 Laplace mechanism 

Unlike the exponential mechanism, the Laplace mecha- 
nism mimics the optimal mechanism. It first adds random 
noise drawn from a Laplace distribution, and like the opti- 
mal mechanism, picks the node with the maximum noise- 
infused utility. 

Definition 5. Laplace mechanism: Given nodes with
utilities $(u_1, \ldots, u_i, \ldots, u_n)$, algorithm $A_L(\epsilon)$ first computes
a modified utility vector $(u'_1, \ldots, u'_n)$ as follows: $u'_i = u_i + r$,
where r is a random variable chosen from the Laplace distri-
bution with scale $\Delta f / \epsilon$ independently at random for each
i. Then, $A_L(\epsilon)$ recommends the node z whose noisy utility is
maximal among all nodes, i.e. $z = \arg\max_i u'_i$.

Theorem 5. $A_L(\epsilon)$ guarantees $\epsilon$-differential privacy.

Proof. The proof follows from the privacy proof of the
Laplace mechanism in the context of publishing privacy-
preserving histograms [8] by observing that one could treat
each node as a histogram bin and release the noisy count
for the value in that bin, $u'_i$. Since $A_L(\epsilon)$ is effectively doing
post-processing by releasing only the name of the bin with
the highest noisy count, the algorithm remains private. □
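
The Laplace mechanism of Definition 5 can be sketched as follows, using only the standard library. The function names, the default sensitivity of 1, and the `rng` parameter are choices of this illustration. The Laplace sample is drawn as the difference of two independent exponential variables, which is one standard way to generate it.

```python
import random


def laplace_noise(scale, rng=random):
    """Laplace(0, scale) sample as the difference of two Exp(1/scale) draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(utilities, epsilon, sensitivity=1.0, rng=random):
    """Add independent Laplace(sensitivity / epsilon) noise to each utility
    and return the index of the node with the largest noisy utility."""
    scale = sensitivity / epsilon
    noisy = [u + laplace_noise(scale, rng) for u in utilities]
    return max(range(len(noisy)), key=noisy.__getitem__)


# Hypothetical utilities: the first node is far better than the second.
rng = random.Random(0)
picks = [laplace_mechanism([10.0, 0.0], epsilon=1.0, rng=rng) for _ in range(2000)]
print(picks.count(0), picks.count(1))
```

Both the noise addition and the final argmax are single passes over the utility vector.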



An astute reader might remark at this point that the
Laplace mechanism as stated does not satisfy the mono- 
tonicity property that we relied upon in our lower bound 
proofs. Indeed, the Laplace mechanism satisfies the prop- 
erty only in expectation; however, that is not an obstacle to 
our analysis since in order to meaningfully compare the per- 
formance of Laplace mechanism with other mechanisms and 
with the theoretical bound on performance, we would need 
to evaluate its expected, rather than one-time, performance. 

6.3 Exponential vs Laplace Mechanisms 

It is natural to ask whether there is an equivalence be- 
tween the two approaches of transforming a non-private al- 
gorithm to a privacy-preserving algorithm or how they would 
compare, perhaps depending on the setting. We present pre- 
liminary results on comparing the utilities when there are 
only two possible recommendations (n = 2). The theorem 
is stated below and the proof can be found in the Appendix. 

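Theorem 6. Let $U_E$ and $U_L$ denote the utilities achieved
by $A_E(\epsilon)$ and $A_L(\epsilon)$ on input vector $(u_1, u_2)$, respectively.
Wlog, assume $u_1 > u_2$. Then

$U_E = \frac{u_1 e^{\epsilon u_1} + u_2 e^{\epsilon u_2}}{e^{\epsilon u_1} + e^{\epsilon u_2}}$

and

$U_L = u_1 \left( 1 - \frac{1}{2} e^{-\epsilon (u_1 - u_2)} - \frac{\epsilon (u_1 - u_2)}{4 e^{\epsilon (u_1 - u_2)}} \right) + u_2 \left( \frac{1}{2} e^{-\epsilon (u_1 - u_2)} + \frac{\epsilon (u_1 - u_2)}{4 e^{\epsilon (u_1 - u_2)}} \right)$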

To our knowledge, in the course of the proof we give the
first explicit closed form expression for the probabilities of
each of the two nodes being recommended by the Laplace mech-
anism (the work of [19] gives a formula that does not apply
to our setting).

Although the expressions for $U_E$ and $U_L$ are difficult to
compare by inspection, by plugging in various values of
$u_1$ and $u_2$ into the formulas, one infers that the Exponen-
tial mechanism slightly outperforms the Laplace mechanism
when $\epsilon$ is very small and the difference between $u_1$ and $u_2$ is
large. We leave it for future work to simplify these expressions as well
as extend the analysis to the n > 2 case.
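
The two-node comparison can be reproduced numerically. The sketch below assumes sensitivity 1 and takes the closed forms of Theorem 6 to read as follows: the Exponential mechanism picks the worse node with probability 1 / (1 + e^{eps d}) and the Laplace mechanism with probability (1/2) e^{-eps d} + eps d / (4 e^{eps d}), where d = u_1 - u_2. The function names and the sample values are hypothetical.

```python
import math


def utility_exponential(u1, u2, epsilon):
    """Expected utility of the Exponential mechanism for n = 2 (u1 >= u2)."""
    x = epsilon * (u1 - u2)
    p_worse = 1.0 / (1.0 + math.exp(x))  # probability of recommending node 2
    return u1 * (1.0 - p_worse) + u2 * p_worse


def utility_laplace(u1, u2, epsilon):
    """Expected utility of the Laplace mechanism for n = 2 (u1 >= u2)."""
    x = epsilon * (u1 - u2)
    p_worse = 0.5 * math.exp(-x) + x / (4.0 * math.exp(x))
    return u1 * (1.0 - p_worse) + u2 * p_worse


# Small epsilon and a large utility gap (hypothetical values).
u1, u2, eps = 100.0, 0.0, 0.01
print(utility_exponential(u1, u2, eps), utility_laplace(u1, u2, eps))
```

For these values the Exponential mechanism comes out slightly ahead, in line with the observation above. Very large values of eps times d would overflow `math.exp` in this simple form.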
Implementation efficiency. The Laplace mechanism is
more intuitive than the Exponential mechanism, and more
likely to receive executive buy-in in a real-world
environment. Furthermore, it has the advantage that it can be
implemented more easily than the Exponential mechanism.
$A_L$ requires computing the noisy utilities and then
selecting the node with the highest noisy utility, which
takes linear time. $A_E$ requires first computing a set of
smoothed utilities and then sampling from the probability
distribution induced by them, which can be accomplished in
linear time using the alias-urn method, but likely slightly
less efficiently in practice than $A_L$.
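The two procedures described above can be sketched as follows. This is a hedged illustration rather than a reference implementation: the function names and the `sensitivity` parameter are assumptions, the Laplace noise is drawn as the difference of two exponential variates, and the sampling step uses Python's plain weighted sampling instead of the alias-urn method.

```python
import math
import random


def laplace_recommend(utilities, eps, sensitivity=1.0, rng=random):
    """A_L sketch: add Laplace(sensitivity / eps) noise, return the argmax index."""
    rate = eps / sensitivity

    def noise():
        # The difference of two i.i.d. exponentials with this rate
        # is Laplace-distributed with scale 1 / rate.
        return rng.expovariate(rate) - rng.expovariate(rate)

    noisy = [u + noise() for u in utilities]
    return max(range(len(noisy)), key=noisy.__getitem__)


def exponential_recommend(utilities, eps, sensitivity=1.0, rng=random):
    """A_E sketch: sample i with probability proportional to exp(eps * u_i / sensitivity)."""
    top = max(utilities)  # subtracting the max keeps the exponentials bounded
    weights = [math.exp(eps * (u - top) / sensitivity) for u in utilities]
    # random.choices does simple weighted sampling; it is not the alias method.
    return rng.choices(range(len(utilities)), weights=weights, k=1)[0]
```

Both functions take one pass over the utility vector per recommendation, which matches the linear-time discussion above. A production system would likely want the alias method if many samples are drawn from the same vector.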



7. UTILITY ACHIEVABLE IN PRACTICE ON A REAL GRAPH

In this section we present experimental results on a real
graph for the # of common neighbors utility function.
The experiments compare the Laplace and Exponential
algorithms against our lower bound. Our experiments suggest
three takeaways: (i) For most nodes, the lower bounds suggest
that there is a huge inevitable trade-off between privacy
and accuracy when making social recommendations; (ii) The
more natural Laplace mechanism performs as well as the
Exponential mechanism; and (iii) For a large fraction of
nodes, the accuracy achieved by Laplace and Exponential
mechanisms does not substantially differ from the best
possible accuracy suggested by our theoretical lower bound.

7.1 Experimental Setup 

For our experiments we use the Wikipedia vote network [11],
available from the Stanford Network Analysis Package (see
footnote 12). Some users in Wikipedia are administrators, who
have access to additional technical features. Users are
elected to be administrators via a public vote of other users
and administrators. The Wikipedia vote network consists of
all users participating in the elections (either casting a
vote or being voted on), from the inception of Wikipedia
until January 2008. We turn this network into an undirected
network, where each node represents a user and an edge
between node i and node j represents that user i voted on
user j or user j voted on user i. The resulting network
consists of 7,115 nodes and 100,762 edges. Although the
Wikipedia vote network is publicly available, and hence the
edges in it are not private, we believe that the graph itself
exhibits the structure and properties of some of the graphs
in which one would want to preserve privacy, such as the
graph of social connections and people's product purchases.

For each pair of nodes in the social network, except nodes
that share an edge, we compute the number of common
neighbors they have in the Wikipedia vote network. Then,
assuming we will make one recommendation for each node in
the graph, we compute the expected accuracy of
recommendation for that node. For the Exponential mechanism
and the theoretical bound, given the utilities of
recommending each node to a given node v, we can compute the
expected accuracy and the theoretical bound on accuracy
exactly. For the Laplace mechanism, we compute its expected
accuracy by running 1,000 independent trials of the Laplace
mechanism, and averaging the utilities obtained in those
trials, for each node in the graph (see footnote 13).
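The procedure above can be sketched on a toy graph as follows. This is an illustrative sketch under stated assumptions, not the code used for the experiments: the graph is made up, accuracy is taken to be expected utility divided by the maximum utility, unit sensitivity is assumed, and all function names are hypothetical.

```python
import math
import random


def common_neighbor_utilities(adj, target):
    """Utility of each candidate = number of neighbors shared with `target`.

    Candidates exclude the target itself and nodes already adjacent to it.
    """
    return {
        v: len(adj[target] & adj[v])
        for v in adj
        if v != target and v not in adj[target]
    }


def exponential_accuracy(utils, eps):
    """Exact expected accuracy of the Exponential mechanism (needs max(utils) > 0)."""
    u_max = max(utils)
    weights = [math.exp(eps * (u - u_max)) for u in utils]
    return sum(u * w for u, w in zip(utils, weights)) / (u_max * sum(weights))


def laplace_accuracy(utils, eps, trials=1000, rng=random):
    """Monte Carlo estimate of the Laplace mechanism's expected accuracy."""
    u_max = max(utils)
    total = 0.0
    for _ in range(trials):
        # Laplace noise with scale 1/eps as a difference of two exponentials.
        noisy = [u + rng.expovariate(eps) - rng.expovariate(eps) for u in utils]
        total += utils[noisy.index(max(noisy))]
    return total / (trials * u_max)


# Toy undirected graph (hypothetical data, not the Wikipedia vote network).
graph = {0: {1, 2}, 1: {0, 3, 4}, 2: {0, 3}, 3: {1, 2, 4}, 4: {1, 3, 5}, 5: {4}}
utils = list(common_neighbor_utilities(graph, 0).values())
print(exponential_accuracy(utils, 0.5),
      laplace_accuracy(utils, 0.5, rng=random.Random(0)))
```

The Exponential mechanism's accuracy is computed exactly from its sampling distribution, while the Laplace mechanism's accuracy is only estimated, mirroring the asymmetry described in the text.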

7.2 Exponential vs Laplace in practice 

We first observe in Figure [1] that for all nodes in the
Wikipedia vote network, the Laplace mechanism achieves
accuracy nearly identical to that of the Exponential
mechanism. This confirms our hypothesis of Section [6] that
the Exponential and Laplace mechanisms are nearly equivalent
in practical settings, and implies that one can use the more
intuitive and easily implementable Laplace mechanism in
practice.

7.3 Social Recommendations: Good or Pri- 
vate? 

We now proceed to evaluate the accuracy of the Exponential
mechanism and compare it with the best accuracy one can hope
to achieve using a privacy-preserving recommendation
algorithm, as computed according to our theoretical bound of
Corollary [1].

For ease of visual presentation, we assume that we do not
care about node identities; we number the nodes in decreasing
order of the accuracy one can hope for when making the
recommendation for that node, as predicted by the theoretical
bound. For each node, the graph in Figure [2] shows the
theoretical bound and the accuracy achieved by the
Exponential mechanism. Due to our chosen numbering of the
nodes, the theoretical bound is a smooth monotonically
decreasing function of the node number, whereas the achieved
accuracy is not necessarily monotonically decreasing (and
thus, in places, does not appear as a line).

12 http://snap.stanford.edu/data/wiki-Vote.html

13 Out of the 7,115 nodes, there are 60 nodes that have no
common neighbors with anyone except nodes they are already
connected to. We omit those nodes from our analysis.

Figure 1: Accuracy achieved by Exponential and Laplace
mechanisms on Wikipedia vote network using # of common
neighbors as a measure of utility. The x-axis represents the
node number, the y-axis - the accuracy of recommendation for
that node. The top graph is for desired privacy guarantee of
$\epsilon = 0.1$, the bottom - for $\epsilon = 0.5$.

Figure 2: Accuracy achieved by Exponential mechanisms and
predicted by theoretical bound on Wikipedia vote network
using # of common neighbors as a measure of utility. The
x-axis represents the node number, the y-axis - the expected
accuracy of recommendation for that node. The top graph is
for desired privacy guarantee of $\epsilon = 0.1$, the
bottom - for $\epsilon = 0.5$.

As can be seen in Figure [2] and Figure [3], for some nodes
the Exponential mechanism performs quite well, achieving
nearly perfect accuracy. However, the number of such nodes
is fairly small. The Exponential mechanism achieves better
than 0.9 approximation for less than 1.5% of the nodes when
$\epsilon = 0.1$ and less than 21% of the nodes when
$\epsilon = 0.5$; it achieves better than 0.8 approximation
for less than 2% of the nodes when $\epsilon = 0.1$ and less
than 25.5% of the nodes when $\epsilon = 0.5$. This matches
the intuition that by making the privacy requirement more
lenient, one can hope to make better quality recommendations
for more nodes; however, it also pinpoints the fact that for
most nodes, the Exponential mechanism does not achieve good
accuracy.

Although there is a possibility that one could develop
better privacy-preserving recommendation mechanisms than
Exponential or Laplace, this experiment shows that for a
large number of target nodes, our theoretical bound quite
severely limits the best accuracy one can hope to achieve
privately. For example, for $\epsilon = 0.1$, no
privacy-preserving algorithm can hope to achieve better than
70% accuracy for more than 9% of the nodes. This finding
throws into serious question the feasibility of developing
social recommendation algorithms that are both accurate and
privacy-preserving for many real-world settings.

Finally, in practice, it is the least connected nodes that
are likely to benefit most from receiving high quality
recommendations. However, our experiments suggest that the
low degree nodes are also the most vulnerable to receiving
low accuracy recommendations due to the needs of
privacy preservation: see Figure [4] for an illustration of
how accuracy depends on the degree of the node. This further
suggests that, in practice, one has to make a choice between
preserving accuracy and preserving privacy.

7.4 Are $A_E$ and $A_L$ good enough for utility functions
based on common neighbors?

As we have experimentally observed in Figure [2], the
Exponential mechanism achieves good accuracy compared to
the best achievable accuracy predicted by our theoretical
bound. We can formalize this statement rigorously as
follows (proved in the Appendix):

Lemma 3. Let $A_E$ denote the accuracy of the Exponential
mechanism, and $A_O$ denote the upper bound on the accuracy
that can be achieved by any privacy-preserving algorithm.
Then, for utility functions based on the number of common
neighbors between two nodes, $\frac{A_E}{A_O} \ge \frac{1}{k+1}$,
where k is the number of nodes with non-zero utility.

Furthermore, 

Lemma 4. For utility vector of the form
$u = (u_{\max}, \ldots, u_{\max}, 0, \ldots, 0)$,
$\frac{A_E}{A_O} \ge \frac{k}{k+1}$, where k is the
number of nodes with non-zero utility.

For real-world graphs, we expect the number of nodes with
non-zero utility k to be fairly small, and thus the
Exponential mechanism to achieve a good approximation to the
best possible accuracy achievable by a privacy-preserving
social recommendation algorithm. Furthermore, observe that
Corollary [1] merely gives an upper bound on accuracy
achievable in a privacy-preserving manner, but it might be
the case that tighter bounds can be obtained. Hence, in many
ways, the Exponential and Laplace mechanisms are
representative of the class of good privacy-preserving
mechanisms one can hope for.

Figure 3: Performance of the Exponential mechanisms and
predicted by theoretical bound on Wikipedia vote network
using # of common neighbors as a measure of utility. The
x-axis represents the accuracy, the y-axis - the expected
percentage of nodes for whom that accuracy of recommendation
is achieved (or predicted by the theoretical bound). The top
graph is for desired privacy guarantee of $\epsilon = 0.1$,
the bottom - for $\epsilon = 0.5$.

8. EXTENSIONS AND FUTURE WORK 

8.1 Vertex privacy and non-monotone algo- 
rithms 

We considered the setting of graph-based social
recommendations in which we wished to keep private the
presence or absence of an edge in the graph, but our
reasoning and results can easily be generalized to a setting
where we would like to protect the entire identity of a node.
To achieve that, one would need to strengthen the definition
of a recommendation algorithm satisfying differential privacy
to consider graphs that differ in one node, rather than one
edge, and adjust accordingly the value of t, the number of
edge alterations needed to turn a node from the low utility
group into a node of maximum utility.

Furthermore, our results can be generalized to social
recommendation algorithms that do not satisfy the
monotonicity property. For clarity of exposition, we omit the
exact statements and proofs of lemmas analogous to Lemmas [1]
and [2], but remark that the statement formulations and our
qualitative conclusions will remain essentially unchanged,
with the exception of the meaning of variable t. Without
the monotonicity property, t would correspond to the number
of edge alterations necessary to exchange the node with
the smallest probability of being recommended and the node
with the highest utility, rather than to the number of edge
alterations necessary to make the node with the smallest
probability of being recommended into the node with the
highest utility, leading to a higher value for t.

Figure 4: Accuracy achieved by Exponential mechanism and
predicted by Theoretical Bound as a function of node degree,
$\epsilon = 0.5$.

8.2 What if utility vectors are unknown? 

Both of the differentially private algorithms we considered
in Section [6] assume knowledge of the entire utility vector.
This assumption cannot be made in social networks for
various reasons. Firstly, computing as well as storing the
utility of $n^2$ pairs is prohibitively expensive when
dealing with graphs of several hundred million nodes.
Secondly, even if one could compute and store them, these
graphs change at staggering rates, and therefore utility
vectors are also constantly changing. We believe that this is
a very important and interesting problem. In this section, we
explore a simple algorithm that assumes no knowledge of the
utility vector; it only assumes that sampling from the
utility vector can be done efficiently.

8.2.1 Sampling and Linear Smoothing

Suppose we are given an algorithm A which is a $\gamma$
approximation in terms of utility, and not provably private.
We show how to modify the algorithm A to guarantee
differential privacy, while still preserving, to some extent,
the utility approximation of A. The proof of the following
theorem, and a note, are placed in the appendix.

Definition 6. Given algorithm $A = (p_1, \ldots, p_i, \ldots, p_n)$,
algorithm $A_S(x)$ recommends node i with probability
$\frac{1-x}{n} + x p_i$, where $0 \le x \le 1$ is a parameter.

Theorem 7. $A_S(x)$ guarantees $\ln\left(1 + \frac{nx}{1-x}\right)$-differential
privacy and an $x\gamma$ approximation of utility.
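The smoothing step can be sketched in Python as below. This is a minimal illustration under stated assumptions: the privacy level used is $\ln(1 + \frac{nx}{1-x})$, obtained in the appendix proof from the ratio of the largest to the smallest possible recommendation probability, and the function names are hypothetical.

```python
import math


def smooth(probs, x):
    """A_S(x) sketch: mix the distribution of A with the uniform distribution."""
    n = len(probs)
    return [(1.0 - x) / n + x * p for p in probs]


def privacy_level(n, x):
    """Privacy parameter ln(1 + n*x / (1 - x)) from the worst-case ratio (needs x < 1)."""
    return math.log1p(n * x / (1.0 - x))


def x_for_privacy(n, eps):
    """Mixing weight x whose privacy_level equals eps."""
    g = math.expm1(eps)  # e^eps - 1
    return g / (g + n)


if __name__ == "__main__":
    # Example: a made-up recommendation distribution over four nodes.
    print(smooth([0.7, 0.2, 0.1, 0.0], 0.25))
    print(x_for_privacy(1000, 0.5))
```

Solving for x shows the cost of this approach: for a fixed privacy level, the admissible weight on the original algorithm shrinks roughly like 1/n as the number of candidate nodes grows.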

Another idea worth exploring is perturbing the input graph
(by adding/deleting a fraction of possible edges) and then
sampling and recommending from it. What is the relationship
between the extent of perturbation and the utility/privacy
guarantees?

8.3 Future Work 

Several interesting questions remain unexplored in this
work. While we have considered some specific utility
functions in this paper, it would be worthwhile to consider
others. Further, our motivation was to look at the most
stringent requirement in terms of privacy; however, a natural
question is to understand utility-privacy trade-offs for
certain typical graphs that arise in social networks.

This paper only considers lower bounds and algorithms 
for making one single recommendation. It would be very 
interesting, and important, to explore how the effect on pri- 
vacy compounds with multiple recommendations. Further, 
some edges can be more sensitive than others. Perhaps the 
solution should be methodological - enable opt-in/opt-out 
settings to specify which nodes/edges are private. A closer 
look at such dependences is required. 

Also, most works on making recommendations deal with
static databases. Social networks clearly change over time
(and rather rapidly). This raises several issues, such as
not being able to assume the utility vector is known,
changing sensitivity, and the privacy impacts of dynamic
databases. Dealing with such temporal graphs and
understanding their trade-offs would be very interesting.

Finally, it would certainly be interesting to extend these
results to weaker notions of privacy than differential
privacy. For instance, some privacy notions previously
defined include k-anonymity, $(\epsilon, \delta)$-differential
privacy, and relaxing the adversary's background knowledge to
just the general statistics of the graph.

APPENDIX

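Proof of Lemma [1]

Proof. We initiate the analysis with a simple claim.

Claim 1. In order to achieve $(1 - \delta)$ accuracy, at least
$\frac{c - \delta}{c}$ of the probability weight has to go to
nodes in the high utility group, so there exists a node x in
the low utility group of $G_1$ that is recommended with
probability of at most $\frac{\delta}{c(n-k)}$, i.e.
$p_x^{G_1} \le \frac{\delta}{c(n-k)}$.

Proof. Denote by $p^+$ and $p^-$ the total probability that
goes to high/low utility nodes, respectively, and observe that

$$p^+ u_{\max} + (1 - c)\, u_{\max}\, p^- \ge \sum_i u_i p_i \ge (1 - \delta)\, u_{\max}$$

and $p^+ + p^- \le 1$, hence $p^+ \ge \frac{c - \delta}{c}$ and
$p^- \le \frac{\delta}{c}$. Since the low utility group contains
$n - k$ nodes, at least one of them is recommended with
probability at most $\frac{\delta}{c(n-k)}$. □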



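We now continue the proof of Lemma [1].

Let $G_2$ be the graph that turns x, found according to the
Claim above, into a node of highest utility by addition of t
edges.

By differential privacy, we have
$\frac{p_x^{G_2}}{p_x^{G_1}} \le e^{\epsilon t}$.

In order to achieve $(1 - \delta)$ accuracy on $G_2$, at least
$\frac{c - \delta}{c}$ of the probability weight has to go to
nodes in the high utility group, and hence by monotonicity
$\Pr[x \mid G_2] \ge \frac{c - \delta}{c(k+1)}$.
Combining the previous three inequalities, we obtain:

$$\frac{(c - \delta)(n - k)}{(k + 1)\,\delta} = \frac{\frac{c - \delta}{c(k+1)}}{\frac{\delta}{c(n-k)}} \le \frac{p_x^{G_2}}{p_x^{G_1}} \le e^{\epsilon t},$$

hence

$$\epsilon \ge \frac{1}{t}\left(\ln\left(\frac{c - \delta}{\delta}\right) + \ln\left(\frac{n - k}{k + 1}\right)\right).$$

This completes the proof. □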
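The inequality just derived can be evaluated numerically, either as a lower bound on $\epsilon$ for a target accuracy or, solved for $\delta$, as an upper bound on accuracy for a given $\epsilon$. The following Python sketch does both for fixed c and k. The function names and the sample numbers are illustrative assumptions.

```python
import math


def epsilon_lower_bound(n, k, c, delta, t):
    """Smallest eps compatible with (1 - delta) accuracy (needs 0 < delta < c)."""
    return (math.log((c - delta) / delta) + math.log((n - k) / (k + 1))) / t


def accuracy_upper_bound(n, k, c, eps, t):
    """The same inequality solved for the accuracy 1 - delta at a given eps."""
    delta = c * (n - k) / (n - k + (k + 1) * math.exp(eps * t))
    return 1.0 - delta


if __name__ == "__main__":
    # Hypothetical values: n candidate nodes, k high utility nodes, t edge changes.
    for eps in (0.1, 0.5, 1.0):
        print(eps, accuracy_upper_bound(n=7000, k=10, c=1.0, eps=eps, t=12))
```

The two functions are inverses of each other in $\delta$ and $\epsilon$, which gives a quick consistency check on the algebra.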

Proof of Lemma [2]

Proof. We first use the concentration axiom to prove
the following claim.

Claim 2. If $c = \left(1 - \frac{1}{\log n}\right)$, then
$k = O(\beta \log n)$, where $\beta$ is the parameter of the
concentration axiom.

Proof. Now consider the case when $c = 1 - \frac{1}{\log n}$.
Therefore, k is the number of nodes that have utility at
least $\frac{u_{\max}}{\log n}$. Let the total utility mass be
$U = \sum_i u_i$. Since by concentration, the $\beta$ highest
utility nodes add up to a total utility mass of
$\Omega(1) \cdot U$, we have $u_{\max} \ge \Omega\left(\frac{U}{\beta}\right)$.
Therefore, k, the number of nodes with utility at least
$\frac{u_{\max}}{\log n}$, is at most
$\frac{U \log n}{u_{\max}}$, which is at most $O(\beta \log n)$. □

We now prove the Lemma using Lemma [1] and Claim [2].
Substituting these in the expression, if we need
$1 - \frac{c(n-k)}{n - k + (k+1)e^{\epsilon t}}$ to be
$\Omega(1)$, then we require $(k+1)e^{\epsilon t}$ to be
$\Omega(n - k)$. (Notice that if $(k+1)e^{\epsilon t} = o(n - k)$,
then $\frac{c(n-k)}{n - k + (k+1)e^{\epsilon t}} \ge c - o(1)$,
which is $1 - o(1)$.)

Therefore, if we want an algorithm to obtain constant
approximation in utility, i.e. $(1 - \delta) = \Omega(1)$, then
we need the following (assuming $\beta$ to be small):

$$(O(\beta \log n))\, e^{\epsilon t} = \Omega(n - O(\beta \log n))$$

Simplifying,

$$\epsilon \ge \frac{\log n - \log \beta - \log\log n}{t}$$

$$\epsilon \ge \frac{\log n - o(\log n)}{t}$$

□



Proof of Theorem [2]

Proof. Lower Bound for Common Neighbors. We formalize the
intuition in terms of an upper bound on t in the following
claim.

Claim 3. For common neighbors based utility functions,
when recommendations for r are being made, we have
$t \le d_r + 2$, where $d_r$ is the degree of node r.

Proof. Observe that if the number of common neighbors
is the measure of the utility of recommendation, then one
can make any zero utility node, say x, for source node r into
a max utility node by adding $d_r$ edges to all of r's
neighbors and additionally adding two more edges (one each
from r and x) to some node with small utility. This is
because the highest utility node has at most $d_r$ common
neighbors with r (one of which could potentially be x).
Further, adding these edges cannot increase the number of
common neighbors for any other node beyond $d_r$. □

We now use this to get the theorem immediately by
replacing t in the expression stated previously. □
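The rewiring in Claim 3 can be made concrete with a short Python sketch. It is only an illustration on a made-up graph: the helper names are hypothetical, the extra node w is chosen by hand, and ties with other candidates are possible, so the check is that x reaches the maximum utility rather than exceeds it.

```python
def common_neighbors(adj, r, y):
    """Number of neighbors shared by r and y."""
    return len(adj[r] & adj[y])


def promote(adj, r, x, w):
    """Connect x to every neighbor of r, then attach both r and x to w.

    Returns the rewired graph and the number of edges actually added.
    Assumes x is not adjacent to r and w is neither r nor x.
    """
    g = {v: set(nb) for v, nb in adj.items()}
    added = 0
    for v in adj[r]:  # at most d_r new edges
        if v not in g[x]:
            g[x].add(v)
            g[v].add(x)
            added += 1
    for a in (r, x):  # at most two further edges
        if w not in g[a]:
            g[a].add(w)
            g[w].add(a)
            added += 1
    return g, added


# Toy undirected graph (hypothetical); node 5 has zero utility for source 0.
graph = {0: {1, 2}, 1: {0, 3, 4}, 2: {0, 3}, 3: {1, 2, 4}, 4: {1, 3, 5}, 5: {4}}
rewired, t = promote(graph, r=0, x=5, w=4)
print(t, common_neighbors(rewired, 0, 5))
```

In this example the number of added edges stays within $d_r + 2$, matching the bound on t stated in the claim.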

Proof of Theorem [3]

Proof. Lower Bound for Sum of Weighted Paths. The number of
paths of length $l$ between two nodes is at most
$d_{\max}^{l-1}$. Let x be the highest utility node and let y
be the node we wish to make the highest utility node after
adding certain edges. If we are making recommendations for
node r, then the maximum number of common neighbors with r
is at most $d_r$.

Denote the current utility of x by $u_x$. We know that
$u_x \le \gamma d_r + \sum_{l \ge 3} \gamma^{l-1} d_{\max}^{l-1}$.
(In fact one can tighten the second term as well.)

We rewire the graph as follows. Any $(c-1)d_r$ nodes (other
than y and the source node r) are picked; here $c > 1$ is to
be determined later. Both r and y are connected to these
$(c-1)d_r$ nodes. Additionally, y is connected to all of r's
$d_r$ neighbors. Therefore, we now get the following.

$$u_y \ge \gamma c d_r$$

Now we wish to bound from above the utility of any other
node in the network in this rewired graph. Notice that every
other node still has at most $d_r$ paths of length 2 with the
source. Further, there are only two nodes in the graph that
have degree more than $d_{\max} + 1$, and they have degree at
most $(c+1)d_{\max}$. Therefore, the number of paths of length
$l$ for $l \ge 3$ for any node is at most
$((c+1)d_{\max})^2 \cdot (d_{\max} + 1)^{l-3}$.
This can be further tightened to
$((c+1)d_{\max})^2 \cdot (d_{\max})^{l-3}$.
We therefore get the following for any x in the rewired graph,

$$u_x \le \gamma d_r + (c+1)^2 \sum_{l \ge 3} \gamma^{l-1} d_{\max}^{l-1}.$$

Now consider the case where $\gamma < \frac{1}{d_{\max}}$. We get

$$u_x \le \gamma d_r + \frac{(c+1)^2 \gamma^2 d_{\max}^2}{1 - \gamma d_{\max}}.$$

We now want $u_y > u_x$. This reduces to

$$(c - 1) > \frac{(c+1)^2\, \gamma\, d_{\max}^2}{d_r\,(1 - \gamma d_{\max})}.$$

Now if $\gamma = o\left(\frac{1}{d_{\max}}\right)$ then it is
sufficient to have $(c - 1) = \Omega(\gamma d_{\max})$, which can
be achieved even with $c = 1 + o(1)$.
Now notice that we only added $d_r + 2(c-1)d_r$ edges to the
graph. This completes the proof of the theorem. □

Comment on relationship between common neighbors and
weighted paths: Since common neighbors is an extreme case of
weighted paths (as $\gamma \to 0$), we are able to obtain the
same lower bound (up to $o(1)$ terms) when $\gamma$ is made
small (in particular, $\gamma = o\left(\frac{1}{d_{\max}}\right)$).
Can one obtain (perhaps weaker) lower bounds when, say,
$\gamma = \Theta\left(\frac{1}{d_{\max}}\right)$?
Notice that the proof only needs $c - 1$ to exceed the
threshold derived there in terms of $\gamma d_{\max}$. We then
get a lower bound of
$\epsilon \ge \frac{1}{\alpha}\left(\frac{1 - o(1)}{2c - 1}\right)$,
where $d_r = \alpha \log n$. Setting $\gamma d_{\max} = s$, for
some constant s, we can find the smallest c that satisfies
that condition.

Notice that this does give a nontrivial lower bound (i.e. a
lower bound tighter than the generic one presented in the
previous section), as long as s is a sufficiently small
constant.

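Proof of Theorem [6]

Proof. Utility of Laplace for $n = 2$: Suppose we
have two elements, with utility $t_1$ and $t_2$, respectively,
where $t_1 \ge t_2$ wlog.

Let $\phi_X(t)$ denote the characteristic function of the
Laplace distribution with scale $\frac{1}{\epsilon}$; it is
known that $\phi_X(t) = \frac{1}{1 + t^2/\epsilon^2}$. Moreover,
it is known that if $X_1$ and $X_2$ are independently
distributed random variables, then
$\phi_{X_1 + X_2}(t) = \phi_{X_1}(t)\,\phi_{X_2}(t) = \frac{1}{(1 + t^2/\epsilon^2)^2}$.
Using the inversion formula, we can compute the pdf of
$X = X_1 + X_2$ as follows:

$$f_X(x) = F'_X(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx}\, \phi_X(t)\, dt.$$

For $x > 0$, the pdf of $X_1 + X_2$ is
$f_X(x) = \frac{\epsilon}{4}(1 + \epsilon x)\, e^{-\epsilon x}$ and
the cdf is
$F_X(x) = 1 - \frac{1}{2} e^{-\epsilon x}\left(1 + \frac{\epsilon x}{2}\right)$.

What is the probability that element 1 is recommended?
By the symmetry of the Laplace distribution, $X_2 - X_1$ has
the same distribution as $X_1 + X_2$, so it is

$$\Pr[t_1 + X_1 > t_2 + X_2] = \Pr[X_2 - X_1 < t_1 - t_2] = 1 - \frac{1}{2} e^{-\epsilon(t_1 - t_2)}\left(1 + \frac{\epsilon(t_1 - t_2)}{2}\right) = 1 - \frac{1}{2} e^{-\epsilon(t_1 - t_2)} - \frac{\epsilon(t_1 - t_2)}{4 e^{\epsilon(t_1 - t_2)}}.$$

Hence, the Laplace mechanism recommends node 1 with
probability

$$1 - \frac{1}{2} e^{-\epsilon(t_1 - t_2)} - \frac{\epsilon(t_1 - t_2)}{4 e^{\epsilon(t_1 - t_2)}},$$

from which the desired statement about $A_L$'s utility
follows. □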
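As a sanity check on the closed form above, one can compare it with a direct simulation. The Python sketch below is an illustration only; it assumes Laplace noise with scale $1/\epsilon$, drawn as the difference of two exponential variates, and the function names, trial count and seed are arbitrary choices.

```python
import math
import random


def laplace_prob_first(t1, t2, eps):
    """Closed form for Pr[node 1 is recommended] when t1 >= t2."""
    d = eps * (t1 - t2)
    return 1.0 - 0.5 * math.exp(-d) - 0.25 * d * math.exp(-d)


def simulate_prob_first(t1, t2, eps, trials=20000, seed=0):
    """Empirical frequency with which t1 + X1 exceeds t2 + X2."""
    rng = random.Random(seed)

    def lap():
        # Difference of two i.i.d. exponentials with rate eps: Laplace, scale 1/eps.
        return rng.expovariate(eps) - rng.expovariate(eps)

    wins = sum(t1 + lap() > t2 + lap() for _ in range(trials))
    return wins / trials


if __name__ == "__main__":
    print(laplace_prob_first(3, 1, 0.5), simulate_prob_first(3, 1, 0.5))
```

With enough trials the simulated frequency should agree with the formula up to sampling error.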



Let $p''_i = \frac{1-x}{n} + x p_i$. We have

$$\frac{1-x}{n} \le p''_i \le \frac{1-x}{n} + x,$$

since $0 \le p_i \le 1$.
The utility of $A_S$ is

$$U(A_S) = \sum_{k=1}^{n} u_k p''_k = \sum_{k=1}^{n} \left(\frac{1-x}{n} + x p_k\right) u_k = \frac{1-x}{n} + x\gamma \ge x\gamma,$$

where we use $\sum_k u_k = 1$ and $\sum_k p_k u_k = \gamma$.

For the privacy guarantee, note again that the upper and
lower bounds on $p''_i$ hold for any graph and utility
function. Therefore, the change in the probability of
recommending i for any two graphs G and G' that differ in
exactly one edge is at most

$$\frac{p''_i(G)}{p''_i(G')} \le \frac{\frac{1-x}{n} + x}{\frac{1-x}{n}} = 1 + \frac{nx}{1-x}.$$

Therefore, $A_S$ is $\ln\left(1 + \frac{nx}{1-x}\right)$-differentially
private. This completes the proof.

Further, note that to guarantee $2\epsilon$-differential
privacy for $A_S(x)$, we need to set the parameter x so that
$\ln\left(1 + \frac{nx}{1-x}\right) = 2c \ln n$ (rewriting
$\epsilon = c \ln n$), namely

$$x = \frac{n^{2c} - 1}{n^{2c} - 1 + n}.$$

The algorithm $A_S$ guarantees a utility of at least $x\gamma$.

□



Proof of Lemma [3]

Proof. Suppose the variations on the common neighbor
functions permitted are $u_i = d_i/z$, where $d_i$ is the
number of common neighbors node i has with the target node,
and z is a scaling constant. Pick $c = 1$, meaning that all
nodes except k have zero utility. Then

$$U_O \le u_{\max}(1 - \delta) \le u_{\max}\left(1 - \frac{n-k}{n - k + (k+1)e^{\epsilon u_{\max}}}\right) = u_{\max}\, \frac{(k+1)e^{\epsilon u_{\max}}}{n - k + (k+1)e^{\epsilon u_{\max}}}.$$

Under our restricted privacy definition, the sensitivity of
the scaled number of common neighbors utility function is
$\frac{1}{z}$.

$$U_E \ge u_{\max}\, \frac{e^{\epsilon u_{\max}}}{n - k + k e^{\epsilon u_{\max}}} \ge u_{\max}\, \frac{e^{\epsilon u_{\max}}}{n - k + (k+1) e^{\epsilon u_{\max}}}.$$

Hence $\frac{U_E}{U_O} \ge \frac{1}{k+1}$ and the Exponential
algorithm gives a $(k+1)$ approximation of utility, which
could be a fairly good approximation if k is small compared
to n, which is what we expect in real-world social network
graphs. □



Proof of Theorem [7] 

Proof. Sampling and Linear Smoothing 



